Cycling area can be a confounder and effect modifier of the association between helmet use and cyclists’ risk of death after a crash

The effect of helmet use on reducing the risk of death in cyclists appears to be distorted by some variables (potential confounders, effect modifiers, or both). Our aim was to provide evidence for or against the hypothesis that cycling area may act as a confounder and effect modifier of the association between helmet use and risk of death of cyclists involved in road crashes. Data were analysed for 24,605 cyclists involved in road crashes in Spain. A multiple imputation procedure was used to mitigate the effect of missing values. We used multilevel Poisson regression with province as the group level to estimate the crude association between helmet use and risk of death, and also three adjusted analyses: (1) for cycling area only, (2) for the remaining variables which may act as confounders, and (3) for all variables. Incidence–density ratios (IDR) and their 95% confidence intervals were calculated. Crude IDR was 1.10, but stratifying by cycling area disclosed a protective, differential effect of helmet use: IDR = 0.67 in urban areas, IDR = 0.34 on open roads. Adjusting for all variables except cycling area yielded similar results in both strata, albeit with a smaller difference between them. Adjusting for cycling area only yielded a strong association (IDR = 0.42), which was slightly lower in the adjusted analysis for all variables (IDR = 0.45). Cycling area can act as a confounder and also appears to act as an effect modifier (albeit to a lesser extent) of the risk of cyclists’ death after a crash.

hypothesized that the protective effect of helmets will be lower at slower speeds of collision, because in this situation the risk of death approaches 0 independently of helmet use. It is not easy to find published examples of actual variables which should be considered potential confounders or effect modifiers. Cycling area (open road vs. urban setting) is an excellent example of a variable which can act a priori both as a confounder and (indirectly) as an effect modifier in the causal link between helmet use and death. The directed acyclic graph depicted in Fig. 1 illustrates this dual role. Helmet use is mandatory in Spain for cycling on open roads but not in urban areas (except for children under 16 years old). Therefore, the distribution of cycling area is clearly unbalanced between helmeted and non-helmeted cyclists involved in road crashes. Furthermore, cycling area strongly affects the travelling speed of both cyclists and other vehicles on the road, which in turn is the main determinant of cyclists' risk of death after a collision. These facts make cycling area a potentially strong confounder, opening a backdoor (non-causal) path between helmet use and death, and thus biasing toward the null any estimates of the causal path. In addition, cycling area is a major antecedent variable that influences collision speed, because speed limits differ between the two environments. The speed at the time of the crash determines the amount of kinetic energy at impact 10 and, as hypothesized above, may in turn modify the magnitude of the causal association between helmet use and death.
The aim of this study is to search for evidence for or against the hypothesis that cycling area may act as a confounder and an effect modifier of the association between helmet use and risk of death among cyclists involved in road crashes.

Results
Descriptive information for all study variables is presented in Table 1. Table 2 shows all incidence-density ratios (IDR) estimated to assess the association between helmet use and risk of death among cyclists. Crude IDR estimation yielded a point estimate of 1.10, but this ratio was less than 1 when it was calculated separately for the two strata defined by cycling area. This inverse association was substantially stronger for cycling on open roads than in urban settings, with a P value of 0.024 for the interaction term. After adjustment for all possible confounders except cycling area, IDR showed a moderate inverse relationship between helmet use and risk of death (0.81), but the 95% CI clearly included the null value. Again, stratification of this value according to cycling area revealed a similar pattern to that found in the crude analysis, although with a smaller difference between the two estimates and a higher P value for the interaction term (0.223). When IDR was adjusted only for cycling area, it showed a strong inverse association (0.42; 0.31-0.58), which was only slightly weaker after adjustment for all variables (0.45; 0.32-0.63).
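The reversal between the crude and the stratum-specific estimates described above can be illustrated with a small numerical sketch. The counts below are invented for illustration and are not the study data; simple risk ratios are used as a stand-in for the multilevel Poisson IDR, and the names `strata`, `risk_ratio` and `pooled` are hypothetical.

```python
# Hypothetical counts (NOT the study data): (deaths, cyclists) per group.
# Helmeted cyclists are concentrated on open roads, where baseline risk is higher.
strata = {
    "urban": {"helmet": (4, 2000), "no_helmet": (24, 8000)},
    "open_road": {"helmet": (60, 6000), "no_helmet": (30, 1000)},
}


def risk_ratio(exposed, unexposed):
    """Ratio of the risk of death in helmeted vs. non-helmeted cyclists."""
    deaths_1, n_1 = exposed
    deaths_0, n_0 = unexposed
    return (deaths_1 / n_1) / (deaths_0 / n_0)


def pooled(group):
    """Collapse both cycling areas into a single (deaths, cyclists) pair."""
    deaths = sum(s[group][0] for s in strata.values())
    cyclists = sum(s[group][1] for s in strata.values())
    return deaths, cyclists


# Crude estimate: cycling area ignored.
crude_rr = risk_ratio(pooled("helmet"), pooled("no_helmet"))

# Stratum-specific estimates: one ratio per cycling area.
stratum_rr = {
    area: risk_ratio(s["helmet"], s["no_helmet"]) for area, s in strata.items()
}

print(f"crude: {crude_rr:.2f}")
for area, rr in stratum_rr.items():
    print(f"{area}: {rr:.2f}")
```

With these invented counts the crude ratio exceeds 1 while both stratum-specific ratios fall below 1, and the ratio is further from the null on open roads, which mirrors the qualitative pattern in Table 2.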

Discussion
In line with most previous studies 1,2 , our final results show an inverse relationship between cyclists' helmet use and death. The magnitude of this association (IDR 0.45 after adjustment for all variables; i.e. risk reduction of 55%) was very similar to that observed in previous meta-analyses and not very different from that reported in Australia after helmet laws were introduced 11 . These estimates are important in the road safety area. To put them in context, a recent meta-analysis has pointed out that seatbelts reduce fatal injuries by 44% among rear seat occupants 12 .
Although a causal interpretation cannot be ascribed to this association (as is the case for any observational study of this nature), it provides another piece of evidence in favour of the protective effect of helmets on the risk of a cyclist's death after a crash. However, the main utility of our study is to stress the need for observational study designs to give careful consideration to the strong confounding or modifier effect of some variables which may be easily overlooked. Regarding the protective effect of helmet use, cycling area is an excellent example of a confounder. Although mandatory legislation such as that currently in effect in Spain, Israel, Chile or Slovakia 13 might be a main determinant of differences in the prevalence of helmet use depending on cycling area, this is not the only cause. Several other studies have reported similar results, with the higher prevalence of helmet use on open roads 7,14-16 explained by factors such as sport cycling and differences in risk perception. In addition, the association between the area where the crash occurred and its severity is well documented in previous studies 17-21 . However, few studies to date have specifically compared cyclists' road safety on urban and open roads, including rural settings. Bambach et al. 6 considered the effectiveness of helmet use against head injury in rural and urban locations in Australia, but their results were non-significant and the location was not included in the multivariate analysis. In Taiwan, Kuo et al. 22 found that cyclists who sustained head injuries were cycling in the fast lane much more frequently on rural roads than on urban roads (29.8% vs. 2.8%). This finding may be associated with crash severity, and thus with a fatal outcome. In Denmark, Kaplan et al. 23 found severe injuries to be more frequent on rural roads than in dense urban settings. The authors hypothesized that safety in numbers might affect their associations, but it seems more plausible to attribute these differences to the speed of the vehicles involved in crashes. In Spain, speed limits in urban areas are set at 50 km/h maximum 24 . Rural areas include open roads, where the speed limit may be almost double that (90 km/h). Lastly, although Aldred et al. 3 explicitly recognized the possible confounding role of cycling area on their estimate of the association between helmet use and crash severity, it seems surprising that they did not control for this factor. With respect to other confounders, our results also show that cyclist- and environment-related factors tend to mask the inverse relationship between helmet use and death (i.e., an IDR of 1.10 in the crude estimate vs. 0.81 after adjustment for all variables except cycling area). After adjustment for cycling area, these factors continue to bias the association away from the null, albeit to a very small extent (i.e., an IDR of 0.42 or 0.45). This pattern is consistent with that found in several previous studies 20,23,25-28 .
Regarding the hypothetical role of cycling area as an effect modifier, our results also provide evidence in support of this role, although the effect was smaller. The differences in our point estimates of IDR in the two strata defined by cycling area point to a stronger protective effect of helmet use on open roads, where collision speeds are likely higher. This pattern was evident in our crude IDR; however, the differences were smaller for the corresponding adjusted estimates, their 95% CI overlapped, and the high P value does not allow us to rule out chance as the only explanation for these differences. Unfortunately, comparisons between these findings and previous studies are hampered by the lack of published studies on this topic. We identified only one similar study (from France), in which Amoros et al. 14 found, as we did, that the protective effect of helmet use was much greater in rural areas than on urban roads. These authors identified an interaction between helmet use and area of the crash for the risk of severe injuries.
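One way to formalize the effect modification discussed above is to write the model with an explicit interaction term. The notation below is an assumption made for illustration: the source states only that a multilevel Poisson model with province as the group level was used and that an interaction term between helmet use and cycling area was tested. It does not give the model equation, and the random-intercept form is assumed here. Let $Y_{ij}$ be death for cyclist $i$ in province $j$, $H_{ij}$ helmet use (1 = yes) and $A_{ij}$ cycling area (1 = open road, 0 = urban).

```latex
\log \mathrm{E}[Y_{ij}] = \beta_0 + \beta_1 H_{ij} + \beta_2 A_{ij}
                        + \beta_3 H_{ij} A_{ij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)

\mathrm{IDR}_{\text{urban}} = e^{\beta_1},
\qquad
\mathrm{IDR}_{\text{open road}} = e^{\beta_1 + \beta_3},
\qquad
\frac{\mathrm{IDR}_{\text{open road}}}{\mathrm{IDR}_{\text{urban}}} = e^{\beta_3}
```

Under this assumed form, the P value for the interaction term tests $\beta_3 = 0$, that is, whether the two stratum-specific IDR differ by more than chance. As a rough check with the crude point estimates, $0.34 / 0.67 \approx 0.51$, so the crude IDR on open roads is about half that in urban areas.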
Apart from its observational design, other limitations related mainly to our data source may compromise the validity of our estimates. Our database is not supported by any coroner's report; thus, the cause of death cannot be identified objectively, and it could have been due to other unrelated causes. Selection bias may arise because, as in any police-based register, less severe crashes are underrepresented 29,30 . If helmet use causes a true reduction in the severity of injuries, this would lead to underestimation of its protective effect in our study. Regarding information bias, although we used a multiple imputation procedure to reduce bias related to missing values for helmet use and some other variables, this strategy is not useful to control for biases when missing data are generated by a missing-not-at-random (MNAR) mechanism, a situation which is highly plausible for the undetermined number of missing values for helmet use.
In summary, our hypothesis regarding the possible confounding effect of cycling area on the association between helmet use and risk of death is supported by our results. The findings for the role of cycling area as an effect modifier, however, are less clear, although our results point towards this effect. These results have two practical implications. First, we provide an excellent real-life example for teaching purposes in two topics that are highly relevant to epidemiology, namely confounding and effect modification, given that it is not easy to find a variable which can behave in both these ways. Second, our results stress the need to carefully address heterogeneity across observational studies in attempts to analyse the magnitude of effect of protective interventions such as helmet use. Our results draw attention to the need for road safety researchers to be alert to the potentially important effect that some easily overlooked variables may have on causal mechanisms inferred from estimates of the magnitude of association. Otherwise, the direction of the estimated associations could be incorrect, as in some of the studies mentioned above 3 . This would be an example of Simpson's paradox 31,32 .

Methods
We analysed the case series comprising all 24,605 cyclists involved in road crashes in Spain from 2014 to 2017, as recorded in the Spanish Register of Victims of Road Crashes maintained by the Spanish National Directorate of Traffic 33 . Except for data from two regions (Catalonia and Basque Country), for which information is lacking for 2014 and 2015, this is a nationwide, anonymized police-based database that contains data on every crash recorded by the national police corps in which at least one person was injured. We excluded crashes that occurred in Ceuta and Melilla, Spanish cities located overseas that have no open roads. Because the database is anonymized and maintained by a third party, and there was no intervention, this study was exempt from the requirement to seek informed consent or ethics committee approval.
Our exposure variable was helmet use (yes/no), and our outcome was death within the first 30 days after the crash (yes/no). Other covariates were individual characteristics of the cyclists, and environmental variables. Further information and details on the selection of variables were reported previously 17 . These variables are summarized in Table 1.
The proportion of missing values for our exposure variable (helmet use) was greater than 25%. Assuming that a non-negligible proportion could have been missed due to a missing-at-random mechanism, we used the Stata command ICE 34 to implement a multiple imputation procedure with 50 iterations according to the chained equations method proposed by Van Buuren 35 , as suggested by the existing literature 36 . We considered that there could be differences in our data nested in the province where the crash occurred, so a multilevel model was used (with cyclist and province as aggregation levels). Because death was an infrequent outcome, we used Poisson regression modelling to obtain the incidence-density ratio with 95% confidence intervals (IDR; 95% CI) in order to quantify the magnitude of association between helmet use and the risk of death. The estimates for each imputed dataset were combined by applying Rubin's method with the MIM command 34 . We obtained crude IDR estimates for the whole sample and for the two strata defined by cycling area, and three additional IDR: adjusting (1) only for cycling area, (2) for the remaining variables which could act as confounders, and (3) for all variables. The P value for the interaction term between helmet use and cycling area was obtained in crude and adjusted models. All statistical analyses were done with Stata software v.14 37 .
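The pooling step can be sketched in a few lines. This is a minimal illustration of Rubin's rules applied to log-IDR estimates, not a reproduction of the Stata MIM command: the function name `pool_rubin` and the input values are hypothetical, and the confidence interval uses a normal approximation rather than the t reference distribution that Rubin's method prescribes.

```python
import math
from statistics import mean, variance


def pool_rubin(log_estimates, std_errors, z=1.96):
    """Pool per-imputation log-IDR estimates with Rubin's rules.

    Returns (IDR, lower, upper) on the ratio scale. Assumption: a normal
    approximation is used for the interval instead of a t distribution.
    """
    m = len(log_estimates)
    pooled_log = mean(log_estimates)                 # pooled point estimate
    within = mean([se ** 2 for se in std_errors])    # within-imputation variance
    between = variance(log_estimates)                # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between           # total variance
    se_total = math.sqrt(total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * se_total),
        math.exp(pooled_log + z * se_total),
    )


# Hypothetical inputs: log-IDR and standard error from three imputed datasets.
idr, lower, upper = pool_rubin([-0.80, -0.78, -0.82], [0.17, 0.17, 0.17])
print(f"IDR = {idr:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

Pooling is done on the log scale because that is the scale on which the Poisson regression coefficients are estimated; the between-imputation term widens the interval to reflect uncertainty about the imputed values.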

Data availability
The data underlying this article were provided by the Spanish National Directorate of Traffic by permission. Data will be shared on request to the corresponding author with permission of the Spanish National Directorate of Traffic.